Abstract In this paper we present a machine vision system to efficiently monitor, analyze and present visual data acquired with a railway overhead gantry equipped with multiple cameras. This solution aims to improve the safety of daily railway transportation in a twofold manner: (1) by providing automatic algorithms that can process large images of trains; (2) by helping train operators keep their attention on any possible malfunction. The system is designed with state-of-the-art high-rate visible and thermal cameras that observe a train passing under a railway overhead gantry. The machine vision system is composed of three principal modules: (1) an automatic wagon identification system, recognizing the wagon ID according to the UIC classification of railway coaches; (2) a temperature monitoring system; (3) a system for the detection, localization and visualization of the pantograph of the train. These three machine vision modules process train sequences in batch, and the resulting analyses are presented to an operator through a multitouch user interface.

We detail all technical aspects of our multi-camera portal: the hardware requirements, the software developed to deal with the high-frame-rate cameras and ensure reliable acquisition, the algorithms proposed to solve each computer vision task, and the multitouch interaction and visualization interface. We evaluate each component of our system on a dataset recorded in an ad-hoc railway test-bed, showing the potential of our proposed portal for train safety assessment.

Keywords system • machine vision • train • safety 

1 Introduction 

In recent years, train safety has drawn the attention of the media and public opinion after several disastrous train accidents, such as those that happened in Italy in 2009 [14] and in France in 2013 [2]. Train accidents may result either from a problem on the railway tracks, as was the case for the French accident, or from an issue with the train itself. The analysis of the railway tracks requires installing sensors on board a train that travels on the tracks to be inspected. Several proposals have been made in this direction [6,9]. This work focuses on the safety assessment of the train itself.

A train is composed of a locomotive and multiple wagons, and any of its components can be a risk for the safety of the train. A single wagon failure can trigger the derailment of several wagons and have dramatic consequences. The Viareggio accident [14] is believed to be the consequence of an axle failure on a tank wagon: the wagon hit the platform of the station and overturned to the left, and several following wagons also overturned, exploded and caught fire. Another important aspect of train safety is temperature monitoring, especially when the train is approaching a tunnel, where escape in case of fire can be difficult. For example, the Kaprun disaster [21] was due to an electric fan heater that caught fire. Hence, detecting an abnormal temperature on any part of the train can provide an early notice of an issue and thus prevent its potentially dramatic outcome. A train can be considered safe for transit if all wagons are suited for transit on the railway, the locomotive and all wagons exhibit nominal temperatures, and no out-of-shape elements are present. Failure to fulfill any of these requirements may indicate a risk situation. The status of each train could be checked by stopping and analyzing it in an off-track location before it is allowed to travel, but this would induce serious delays in train traffic. Hence, more interest has been put into portal-based systems that can be installed at key points of the railway network (e.g. before a tunnel or before entering a train station) in order to assess all the safety requirements on the fly. This solution has the clear advantage of not requiring the train to stop for the analysis, and multiple sensors can be installed on a single portal, providing a thorough analysis of the train status at once. However, such portal-based approaches require the monitoring system to capture all required signals even for a train running at full speed.

This paper describes our proposed multi-camera portal for train safety assessment, developed in the context of the Integrated Intermodal System for Security and Signaling on Rail (SISSI) project, funded by Regione Toscana (Italy). Our system relies on high-speed and thermal cameras to monitor several aspects of the train. The acquired signals are processed by computer vision methods to extract meaningful information. Finally, all the information is provided to an operator through a touch-based user interface. In the next section we first review the state of the art of computer vision based systems for train safety and of touch-based interaction for control rooms and train safety. We then give an in-depth presentation of our proposed system in section 3, specifying the hardware and giving an overview of the software developed to obtain reliable data acquisition from all sensors. In section 4, we detail how we solve each target task of the train analysis, namely the automatic wagon identification, the temperature monitoring, and the detection and localization of the pantograph of the train. We then describe how all the results are provided to the operator on our multitouch interface. Finally, in section 5, we evaluate each sub-system of our multi-camera portal on a dataset recorded in an ad-hoc railway test-bed.


2 Related work 

2.1 Computer vision based systems for train safety 

In the literature we can distinguish approaches that target the safety assessment of the train surroundings from those that target the train itself. Some approaches focus on a single aspect of train safety, while multi-modal portals try to analyze multiple safety features at the same time.

Many railway accidents happen at railway crossings, where an object such as a car is stopped on the railway; hence one common use of computer vision is to detect whether an obstacle is obstructing the railway. Machine vision was used in [19] to detect moving obstacles in these locations. A 3D vision system for obstacle detection is proposed in [24]. Train stations are also a risk environment: in [8] the authors present a method for automatically detecting people jumping or falling off a train platform.

Another aspect to consider when assessing train safety is the proper configuration of the train itself. In [20], a system to detect misalignment of a train pantograph is proposed. The authors of [12] proposed a multi-function portal similar in spirit to ours, with the main objectives of detecting carriage misalignment or abnormal temperatures so as to be able to stop a train before it enters a tunnel. They rely on line-scan cameras to obtain the train image in the visible domain, pyroelectric line cameras for thermal imaging, and a distributed time-of-flight telemeter for the train shape analysis. The evaluation targets mostly the sensor performance, and only qualitative results of out-of-shape detections are given.


2.2 Touch-based interface for control rooms and train safety

Operators in control rooms are often asked to monitor multiple safety characteristics and have to perform crucial security operations in a short amount of time. This is the main reason why information visualization and touch-based interaction play a key role when developing a monitoring tool for a control room. Several studies [13,11] have assessed the benefits of adopting a multitouch workstation for tasks that require interaction with multiple visual cues. Results have shown that multitouch interaction can be twice as fast as mouse-based interaction. Furthermore, multitouch interaction is often preferred by users because of the direct manipulation of graphical elements it offers, resulting in a more natural and effective approach to carrying out the requested tasks. Touch-based interaction has been exploited in control and security processes since the early seventies. In 1973, Beck and Stumpe [3] proposed a prototype touchscreen device to control the new CERN accelerator. In recent years, studies were conducted to propose and evaluate good practices in the design process of touch-based interfaces for security operators. Zahler [26] proposes multiple patterns for the design of touch-based user interfaces for railway security and other safety-critical applications. In [4], the authors investigate the effectiveness of direct manipulation in multitouch interfaces for safety-critical situations in a maritime control room. Results showed that direct manipulation of interface elements can enhance the situational awareness of users.

| Camera Model | Producer       | Data type | Matrix/Line | Hz    | Resolution | Data rate | Temperature range | # of Sensors |
|--------------|----------------|-----------|-------------|-------|------------|-----------|-------------------|--------------|
| HM-640       | Teledyne Dalsa | Visual    | Matrix      | 300   | 640x480    | 92MB/s    | -                 | 1            |
| Spyder 4K    | Teledyne Dalsa | Visual    | Line        | 18500 | 4096x1     | 80MB/s    | -                 | 2            |
| 256L         | PYROLINE       | Thermal   | Line        | 512   | 256x1      | 128KB/s   | [30°...800°]      | 2            |

Table 1 References, characteristics and count of all our sensors.

Evaluating and testing safety-critical interfaces is crucial to show whether a newly developed system actually fulfills its goals. The authors of [22] propose a method for the evaluation of user interfaces for railway safety based on a high-fidelity simulator of an interlocking system. Although many standardized usability evaluation methods exist and are commonly used for general purpose systems, some specific methods have been defined for the evaluation of safety-critical interactive systems [23].


2.3 Contribution 

Our proposal is to use high-speed cameras mounted on a railway overhead gantry to monitor multiple aspects of the train. In particular, we have designed a machine vision system that coordinates different sensors with different acquisition rates to: (1) automatically segment the wagon identifier according to the UIC classification of railway coaches; (2) extract the wagon temperature to prevent fires and flames on board; and (3) detect the pantograph passage.

We propose a touch-based interface that adopts interaction metaphors such as direct manipulation and multitouch gestures. The goal of the touch-based user interface is to give operators a quick and efficient way to interact with the results of the video sequence analysis. By manipulating all the output of the machine vision system, the operator is able to efficiently check the train's safety requirements.


3 Our multi-sensor portal

In this section we describe the physical structure and the sensor characteristics of our multi-camera portal. We then detail the acquisition manager we designed to manage all the sensors together and deal with their high-rate acquisition.


3.1 Architecture of the portal and sensors involved 

The portal is built over a single rail, has a height of 8.5 meters and is 6 meters wide. The distance from the train side is around 1.5 meters. This gantry is equipped with a total of 5 sensors; Fig. 1 illustrates our portal configuration.

We used three different types of high-rate sensors. We summarize the characteristics of each sensor in Table 1. Two of these sensor types work in the visual spectrum (one line-scan and the other matrix) while the last one works in the thermal spectrum (with line-scan acquisition).

In particular, the matrix camera operating in the visual spectrum (Teledyne Dalsa HM-640) acquires 300 grayscale images per second at a resolution of 640x480 pixels. This camera is positioned at the top center of the portal and is used to give a general overview of the train passing by and for the visual analysis of the state of the pantograph.

Fig. 1 The portal equipped with all cameras. The visual matrix camera (HM-640) is on top in the center of the gantry, two thermal cameras (256L) are positioned one on each side of the portal, and two visual line cameras (Spyder 4K) are positioned on the same side, one on top of the other.

Fig. 2 Acquisition and processing system architecture

The linear camera operating in the visual spectrum (Teledyne Dalsa Spyder 4K) acquires 18500 grayscale lines per second with a height of 4096 pixels. We have positioned two linear cameras on the side of the portal: one is used to observe the bottom and central part of the train and provides the input of the wagon identification system, see section 4.1; the second linear camera is positioned higher on the side of the portal to capture the top of the train and is used for the detection of the pantograph detailed in section 4.3.

Finally, the linear camera operating in the thermal spectrum (PYROLINE 256L) acquires 512 lines per second of 256 pixels each, with a temperature range of [30°...800°]. One of these thermal sensors is positioned at the top of each side of the portal, and they are used to monitor the temperature anywhere in the passing train, as explained in section 4.2.

3.2 Acquisition framework 

Full synchronization of all the sensors can be really difficult due to their high and different acquisition rates. However, since each sensor is devoted to a specific function, we do not need full and perfect synchronization of the acquisition. We designed a specific hardware/software solution that allows us to obtain a coarse synchronization between all the sensors, enabling a meaningful and easily interpretable playback of the acquisition for the operator.

As regards hardware, we set up three separate servers to deal with the large amount of data produced by the high-rate cameras mounted on the portal. Specifically, two servers each acquire data from one linear and one thermal camera, while the third server deals only with the matrix camera, as shown in the overview of our system architecture given in Fig. 2. These servers have 8 SAS disks in RAID-01 to obtain the write speed needed (about 270MB/s) to store all the generated data, while maintaining sufficient reliability.
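As a rough cross-check of these figures, the raw throughput of each sensor can be estimated from the resolutions and rates listed in Table 1. The sketch below assumes 1 byte per pixel, which is not stated in the text, so the numbers are indicative only; under that assumption the line-scan camera comes out slightly below the 80MB/s listed in Table 1, and the total stays below the quoted disk write speed.

```python
# Raw sensor throughput estimated from Table 1.
# ASSUMPTION: 1 byte per pixel (the bit depth is not given in the text).
BYTES_PER_PIXEL = 1

SENSORS = {
    # name: (pixels per frame or line, frames or lines per second, count)
    "HM-640 (matrix)": (640 * 480, 300, 1),
    "Spyder 4K (line)": (4096, 18500, 2),
    "PYROLINE 256L (thermal line)": (256, 512, 2),
}


def raw_rate(pixels, rate_hz, bytes_per_pixel=BYTES_PER_PIXEL):
    """Raw data rate of a single sensor, in bytes per second."""
    return pixels * rate_hz * bytes_per_pixel


# Rate of one unit of each sensor type.
per_sensor = {name: raw_rate(p, hz) for name, (p, hz, _) in SENSORS.items()}

# Aggregate rate over all five sensors mounted on the portal.
total = sum(raw_rate(p, hz) * n for p, hz, n in SENSORS.values())

for name, rate in per_sensor.items():
    print(f"{name}: {rate / 1e6:.2f} MB/s")
print(f"all sensors: {total / 1e6:.1f} MB/s")
```

With these assumptions the matrix camera alone accounts for about 92 MB/s, which matches Table 1 and suggests why it is given a dedicated server.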

Concerning the software, we designed a solution that allows us to control all sensors simultaneously while optimizing CPU, memory and disk usage. In particular, we designed a frame grabber for each type of sensor (matrix, linear, visual or thermal), exploiting the respective SDK provided by the camera vendor.

As shown in Fig. 2, each frame grabber is controlled by an HTTP server (AcquisitionServer). For each camera this server implements some acquisition primitives (e.g. start, stop, pause) as well as some specific functions depending on the camera model, for example focus control for the thermal cameras.

To control all the AcquisitionServers (one per sensor) simultaneously, we designed another HTTP server called the AcquisitionManager. Each AcquisitionServer registers with the AcquisitionManager and periodically sends its state. Once an AcquisitionServer is registered with the AcquisitionManager, a simple web interface we developed makes it possible to check its IP address and the state of the grabber, and to invoke all the primitives available for the corresponding sensor.

The acquisition primitives common to all sensors are shown as a single button in the web interface of the AcquisitionManager, while the primitives specific to a sensor are shown only for the registered sensors that can use them. Having a common interface showing the state of all sensors is particularly useful, for example, to prevent starting a new acquisition or stopping an ongoing one if any of the AcquisitionServers is still saving recently acquired data. When an AcquisitionServer is closed, it unregisters from the AcquisitionManager. This software design offers the advantage that a sensor can be easily added, activated or deactivated for a specific acquisition.
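The registration logic described above can be illustrated with a minimal in-memory sketch. It deliberately leaves out the HTTP layer, and the class layout, state names ("idle", "saving") and addresses are hypothetical, chosen only for illustration; they are not taken from the actual implementation.

```python
from dataclasses import dataclass

# Primitives shared by every sensor (named in the text).
COMMON_PRIMITIVES = ("start", "stop", "pause")


@dataclass
class ServerEntry:
    """What the manager remembers about one registered AcquisitionServer."""
    address: str
    state: str = "idle"            # hypothetical states: idle, acquiring, saving
    extra_primitives: tuple = ()   # sensor-specific primitives, e.g. focus


class AcquisitionManager:
    """In-memory stand-in for the manager; no networking is modelled."""

    def __init__(self):
        self.servers = {}

    def register(self, name, address, extra_primitives=()):
        self.servers[name] = ServerEntry(address, "idle", tuple(extra_primitives))

    def report_state(self, name, state):
        # Called periodically by each registered server.
        self.servers[name].state = state

    def unregister(self, name):
        self.servers.pop(name, None)

    def primitives(self, name):
        # Common primitives plus those specific to this sensor.
        return COMMON_PRIMITIVES + self.servers[name].extra_primitives

    def can_switch_acquisition(self):
        # Starting or stopping is refused while any server is still saving.
        return all(s.state != "saving" for s in self.servers.values())


manager = AcquisitionManager()
manager.register("thermal-left", "10.0.0.11", extra_primitives=("focus",))
manager.register("matrix-top", "10.0.0.13")

manager.report_state("thermal-left", "saving")
blocked = not manager.can_switch_acquisition()

manager.report_state("thermal-left", "idle")
allowed = manager.can_switch_acquisition()
```

Keeping the shared state in one place is what lets a single button act on all sensors while still guarding against interrupting a server that is flushing data to disk.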


4 Train analysis 

The aim of the proposed system is to analyze both the visual and thermal data acquired by the high-rate sensors described in the previous section. In particular, we developed three sub-systems to recognize the wagon identifier, monitor the temperature, and detect the pantograph. Finally, we give an overview of the multitouch user interface we designed to enable an operator to interact with the processing results in the control room.


4.1 Wagon identification 

The Wagon identifier subsystem aims to identify the wagon by segmenting its unique international identification number from the image acquired with the Visual Line Camera 1 positioned at the bottom right of the portal, see Fig. 1. From this identifier multiple characteristics can be extracted (for example type of wagon/locomotive, owner and country), and thus one can determine whether the wagon is expected and allowed to transit on the monitored railway section. Due to the huge dimension of the image and the presence of noise, we need to apply a robust identifier segmentation method. The whole method, described in Algorithm 1, relies on image processing and geometric analysis to obtain the position of the identification number in the image.

Algorithm 1: Wagon ID segmentation.
Input: $I$, $r_D$, $d$, $s$, $w$, $h$
Output: $b$

 1  Compute $\tau_o = \mathrm{AdaptiveThreshold}(I)$;
 2  Extract edges $I_e = \mathrm{CannyEdgeDetector}(I, \tau_o)$;
 3  Perform morphological dilation $I_D = I_e \oplus \mathrm{disk}(r_D)$;
 4  Perform hole filling $I_f = \mathrm{fill}(I_D)$;
 5  Extract connected components $c_{bbox} = \mathrm{extractCC}(I_f)$;
 6  Initialize votes $v \leftarrow 0$;
 7  Initialize $j \leftarrow 0$;
 8  while $j < w$ do
 9      Initialize $k \leftarrow 0$;
10      while $k < h$ do
11          $\hat{c}_{bbox} = \mathrm{SelectCC}(c_{bbox}, j, k, d)$;
12          $\mathit{in} = \mathrm{RansacFitLine}(\hat{c}_{bbox})$;
13          $v = v + \mathrm{Voting}(\hat{c}_{bbox}, \mathit{in})$;
14          $k = k + s$;
15      end
16      $j = j + s$;
17  end
18  $b = \mathrm{SegmentSalientRegions}(I, c_{bbox}, v)$;

Given an image of a wagon $I$ of width $w$ and height $h$, we first apply an adaptive thresholding method to find the optimal threshold that separates foreground from background pixels [18]:

$$\tau_o = \mathrm{AdaptiveThreshold}(I). \tag{1}$$

The Otsu adaptive thresholding algorithm assumes that the image contains two classes of pixels (e.g. foreground and background) and calculates the optimum threshold separating these two classes so as to minimize the intra-class variance. The threshold $\tau_o$ is then used as input for the Canny edge detector [7] to segment the contours of the foreground elements present in the image:

$$I_e = \mathrm{CannyEdgeDetector}(I, \tau_o). \tag{2}$$

The regions defined by connected edges are filled, first with a morphological dilation and then with a fill operation that closes the remaining small holes:

$$I_D = I_e \oplus \mathrm{disk}(r_D), \tag{3}$$

$$I_f = \mathrm{fill}(I_D), \tag{4}$$

where $r_D$ is the radius of the disk used for the dilation. Once those regions are filled, a connected components labelling algorithm is used to define the bounding boxes containing the blob regions previously segmented:

$$c_{bbox} = \mathrm{extractCC}(I_f). \tag{5}$$

Since the connected components generally correspond to the foreground objects in the image, after the labelling we know how many foreground objects are contained in the image and which pixels belong to each object. The remaining question is which of these objects are characters of the wagon identifier.
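To make the first and last steps of this preprocessing chain concrete, here is a small pure-Python sketch of Otsu thresholding and of connected-component bounding boxes on a toy image. It is only illustrative: the Canny, dilation and hole-filling steps are skipped (the thresholded image is labelled directly), 4-connectivity is an assumption, and the function names are hypothetical.

```python
from collections import deque


def otsu_threshold(pixels, levels=256):
    """Grey level maximizing the between-class variance, which is
    equivalent to minimizing the intra-class variance."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of the values of those pixels
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


def bounding_boxes(mask):
    """Label 4-connected components of a binary mask (list of rows) and
    return one (x_min, y_min, x_max, y_max) box per component."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            queue = deque([(x, y)])
            seen[y][x] = True
            x0 = x1 = x
            y0 = y1 = y
            while queue:
                cx, cy = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if 0 <= nx < w and 0 <= ny < h and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes


# Toy 6x4 image: dark background (10) with two bright blobs (200).
image = [
    [10, 10, 10, 10, 10, 10],
    [10, 200, 200, 10, 10, 10],
    [10, 200, 200, 10, 200, 10],
    [10, 10, 10, 10, 200, 10],
]
tau_o = otsu_threshold([p for row in image for p in row])
mask = [[p > tau_o for p in row] for row in image]
boxes = bounding_boxes(mask)
```

On a real 4096-pixel-high mosaic one would normally rely on an optimized image processing library rather than on nested Python loops; the sketch only shows what each step computes.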

In order to identify the connected components corresponding to characters, we apply a voting procedure based on the sliding-window paradigm. In particular, given all the bounding boxes of the connected components, we can assume that the characters of the identifier are close to each other and mostly aligned along a line. For this purpose, we apply a sliding-window procedure (with a sampling step of $s$ pixels) to the image, and for each sub-window, of dimension $d \times d$ pixels, we estimate a line through the RANSAC algorithm [10], considering only the bottom-right points of the bounding boxes present in that sub-window:

$$\hat{c}_{bbox} = \mathrm{SelectCC}(c_{bbox}, j, k, d), \tag{6}$$

$$\mathit{in} = \mathrm{RansacFitLine}(\hat{c}_{bbox}), \tag{7}$$

where $j$ and $k$ represent the top right coordinate of the sub-window considered. In each sliding window, the points $\mathit{in}$ selected by RANSAC as inliers accumulate a vote. At the end of this procedure, the points with the most votes correspond to the bounding boxes with the highest probability of containing a character:

$$v = v + \mathrm{Voting}(\hat{c}_{bbox}, \mathit{in}). \tag{8}$$

Finally, to obtain the identifier, which is composed of 12 characters, we select the sub-region $b$ of the image containing the most voted and aligned foreground objects. The alignment is estimated by computing the distances along the x-axis, $D_x$, and the y-axis, $D_y$, for the 20 most important foreground regions according to $v$. We then take the exponential of the negative of these distances and weight the votes previously obtained with these matrices separately:

$$v_w = \exp(-D_x) * v + \exp(-D_y) * v. \tag{9}$$

All regions with a weighted vote $v_w$ greater than zero represent a character of the ID. We then crop the original image to the region containing the set of ID characters, see the example in Fig. 3. From this crop, any Optical Character Recognition (OCR) method can be applied to obtain the identifier. This identifier segmentation step is necessary because wagon images have an average size of 4096 x 80000 pixels and cannot be processed as is by an OCR engine.



4.2 Temperatures segmentation 

The Thermal monitoring subsystem acquires two thermal maps of each wagon and compares them to nominal operating temperatures in order to issue an alarm in case of fire risk due to abnormally high temperatures. The sensors involved in this subsystem are the Thermal Line Cameras 1 and 2, positioned at the top right and top left of the portal, see Fig. 1.

Each thermal camera is connected to and managed by a different server in order to ensure a higher robustness of the thermal subsystem through duplication. The mosaic image obtained by concatenating all the acquired lines, see examples in Fig. 4, is divided into subregions of fixed size, and for each subregion both the mean and maximum temperatures are calculated. The minimum and maximum temperatures coming from one camera are compared with those extracted from the other camera in order to validate their output and ensure that both the servers and the sensors are working correctly.
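A minimal sketch of this per-subregion analysis is given below, assuming the thermal mosaic is available as a list of rows of temperature values. The block size, the nominal maximum used to raise an alarm and the agreement tolerance between the two cameras are hypothetical parameters; the text does not give their actual values.

```python
def block_stats(thermal, block):
    """Split a thermal mosaic into block x block subregions and return
    {(block_row, block_col): (mean, max)} for each of them."""
    stats = {}
    for r in range(0, len(thermal), block):
        for c in range(0, len(thermal[0]), block):
            values = [v for row in thermal[r:r + block] for v in row[c:c + block]]
            stats[(r // block, c // block)] = (sum(values) / len(values), max(values))
    return stats


def alarms(stats, nominal_max):
    """Subregions whose peak temperature exceeds the nominal maximum."""
    return sorted(key for key, (_, peak) in stats.items() if peak > nominal_max)


def extremes(thermal):
    values = [v for row in thermal for v in row]
    return min(values), max(values)


def cameras_agree(left, right, tolerance):
    """Cross-check: both cameras should report similar min and max values."""
    (lo_l, hi_l), (lo_r, hi_r) = extremes(left), extremes(right)
    return abs(lo_l - lo_r) <= tolerance and abs(hi_l - hi_r) <= tolerance


# Toy 4x4 mosaics from the two cameras, with one hot spot.
left = [[40, 42, 41, 43], [41, 40, 95, 44], [40, 41, 42, 43], [42, 43, 41, 40]]
right = [[41, 42, 40, 43], [40, 41, 93, 44], [41, 40, 42, 43], [42, 43, 40, 41]]

stats = block_stats(left, block=2)
hot = alarms(stats, nominal_max=90)
consistent = cameras_agree(left, right, tolerance=5)
```

Working on fixed-size subregions keeps the alarm logic independent of the length of the mosaic, which varies with the train length and speed.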

An important phenomenon to be considered is the distortion of the temperature values caused by the perspective in the image. In particular, the pixels furthest from the center of the sensor are subject to a high distortion caused by the perspective between the sensor and the observed wagon (this phenomenon also depends on the distance from the sensor to the wagon). For this reason we used two cameras observing the wagon from two different viewpoints.



Fig. 4 Example of thermal images acquired by the two cameras. The hottest part (in red) is the locomotive engine.

Fig. 5 SIFT points extracted from the pantograph image template.


4.3 Pantograph detection 

The Pantograph detection subsystem detects the passage of the pantograph in order to avoid false positive cuts in a laser-based system¹ that analyzes the shape of each wagon. The pantograph detection is run on the data acquired by the Visual Line Camera 2, positioned at the middle right of the portal. The resulting segmented high-resolution image can also be analyzed by an operator to determine whether there is any anomaly in the pantograph shape.

The proposed solution is composed of an offline phase and an online phase. Offline, we extract SIFT [15] keypoints from an image of the pantograph used as template $T$, see Fig. 5:

$$D_T = \mathrm{ExtractSIFT}(T). \tag{10}$$

SIFT descriptors are highly discriminative representations of image regions, invariant to changes in brightness, scale and rotation. We store the extracted descriptors in a KD-Tree [16] in order to speed up the matching process:

$$K_T = \mathrm{KDTree}(D_T). \tag{11}$$

Online, once the mosaic image of the train side $I_c$ is obtained, the proposed Algorithm 2 extracts SIFT keypoints from that image:

$$D_c = \mathrm{ExtractSIFT}(I_c). \tag{12}$$

The KD-Tree nearest neighbor search provides the identifier of the closest descriptor. However, due to the curse of dimensionality, descriptors that are neighbors in $\mathbb{R}^{128}$ may not be visually similar. For this reason a second filtering step is introduced: the distances to the first and the second most similar descriptors are measured, and a match is discarded when the ratio between these two distances fails the test against a threshold $\tau_d$, where $\tau_d = 0.67$, as in [5]:

$$D_m = \mathrm{MatchDescriptors}(T, D_c, \tau_d). \tag{13}$$

However, these matches are not guaranteed to be correct. This can occur for various reasons, for example repeated structures in the image or points with similar SIFT descriptors. For this reason a third validation step is performed by applying a robust geometric validation based on a projective transformation model. This is obtained by using the RANSAC algorithm [10] to fit a projective model and then applying a consistency check to the estimated homography in order to determine whether the fitted model is correct:

$$[H, \mathit{in}] = \mathrm{RansacFitProj}(D_m), \tag{14}$$

$$p_{bbox} = \mathrm{CheckGeomConsistency}(H, \mathit{in}). \tag{15}$$

In this way it is possible to establish the presence of the pantograph, to obtain an indication of its location in the image, and to segment the corresponding sub-image to be shown to the operator.

Algorithm 2: Pantograph detection.
Input: $I_c$, $T$, $\tau_d$
Output: $p_{bbox}$

1  Offline:
2  $D_T = \mathrm{ExtractSIFT}(T)$;
3  $K_T = \mathrm{KDTree}(D_T)$;
4  Online:
5  $D_c = \mathrm{ExtractSIFT}(I_c)$;
6  $D_m = \mathrm{MatchDescriptors}(T, D_c, \tau_d)$;
7  $[H, \mathit{in}] = \mathrm{RansacFitProj}(D_m)$;
8  $p_{bbox} = \mathrm{CheckGeomConsistency}(H, \mathit{in})$;

¹ This system is composed of three infrared lasers mounted on the portal. This proprietary solution was developed by Thales Italia and cannot be discussed in the scope of this paper.
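The second filtering step can be illustrated with a short pure-Python sketch of a nearest-neighbor distance ratio test. Two simplifications are assumed here: a brute-force search replaces the KD-Tree, and a match is kept when the ratio of the nearest to the second nearest distance is below the threshold, which is the usual convention but is an interpretation of the text. The toy descriptors are 2-dimensional instead of 128-dimensional.

```python
import math


def ratio_test_matches(template_desc, image_desc, tau_d=0.67):
    """Return (image_index, template_index) pairs whose nearest template
    descriptor is clearly closer than the second nearest one.
    Brute-force search; a KD-Tree would only speed up the lookup."""
    matches = []
    for i, d in enumerate(image_desc):
        dists = sorted((math.dist(d, t), j) for j, t in enumerate(template_desc))
        (d1, j1), (d2, _) = dists[0], dists[1]
        if d2 > 0 and d1 / d2 < tau_d:
            matches.append((i, j1))
    return matches


# Toy descriptors (2-D for readability).
template = [(0, 0), (10, 0), (0, 10)]
image = [
    (0.5, 0),  # very close to template[0]: kept
    (5, 0),    # equally far from template[0] and template[1]: ambiguous
    (9, 1),    # close to template[1]: kept
]
matches = ratio_test_matches(template, image)
```

The ambiguous descriptor is rejected because its two best candidates are equally distant, which is exactly the situation the ratio test is meant to filter out before the geometric validation.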

4.4 Touch-based user interface 

To enable an operator to visualize and interact intuitively with the results of the wagon analysis, we developed a touch-based graphical user interface (GUI) based on multitouch interactions. The aim of the GUI is twofold. On the one hand, it is used to present all the results of the acquisition and analysis to the security operator in a simple way, so that they can quickly get an overview of the train status. On the other hand, it provides the operator with several tools for a direct and easy manipulation of all the data necessary to assess the train safety requirements.

An operator first loads a session of a processed wagon analysis in the interface. A session is composed of (i) a frontal video of the train obtained from the matrix camera positioned at the top of our portal, (ii) two thermal images that are the output of the thermal monitoring subsystem, and (iii) a high-resolution image of the train acquired by the visual linear camera 1. The frontal video can be played and scanned through a timeline visualizer. The timeline of the video is synchronized with visual markers on both the thermal and linear imagery in order to give a visual time reference on all results.

Fig. 6 (a) Overview of the interactive interface. (b) Examples of interaction with the multitouch user interface. On top, the operator checks a point temperature value; on the bottom, the operator zooms into a high-definition image of the train.

Thermal images are displayed with a false-color scale obtained from the temperature values. By default the scale is based on the maximum and minimum values of the image, but the operator can manually change the range of colors in order to enhance the visualization of specific temperature values. Left and right thermal images can be activated with a selector, so that only one image at a time is displayed in the interface.
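As an illustration of this false-color display, the following sketch maps temperatures to RGB triples, using the image minimum and maximum by default and an operator-supplied range otherwise. The linear blue-to-red ramp is a hypothetical choice made for the example; the actual color scale of the interface is not specified in the text.

```python
def false_color(value, lo, hi):
    """Map a temperature to an (R, G, B) triple on a blue-to-red ramp.
    Values outside [lo, hi] are clamped to the ends of the scale."""
    if hi <= lo:
        return (0, 0, 255)
    t = min(1.0, max(0.0, (value - lo) / (hi - lo)))
    return (round(255 * t), 0, round(255 * (1 - t)))


def colorize(thermal, lo=None, hi=None):
    """Colorize a thermal image; by default the scale spans the minimum
    and maximum of the image, but a manual range can be supplied."""
    values = [v for row in thermal for v in row]
    lo = min(values) if lo is None else lo
    hi = max(values) if hi is None else hi
    return [[false_color(v, lo, hi) for v in row] for row in thermal]


default_scale = colorize([[30, 800]])
narrow_scale = colorize([[30, 800]], lo=30, hi=100)
```

Narrowing the range spreads the available colors over fewer degrees, which is why a manual range makes small differences around a temperature of interest easier to see.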

The linear camera acquisition result is a high-resolution image of the train. In order to allow a fluid and smooth manipulation of this image, we adopted a multi-resolution tiling technique [1]. Acquired images are pre-processed in order to obtain a set of downscaled versions of the high-definition ones. Each downscaled version is then decomposed into tiles of 256x256 pixels. The rendering engine of the GUI loads and displays only the tiles required for the current zoom level and the portion of the image visualized by the operator, instead of loading the entire high-definition image. The operator can activate graphical overlays on the train image in order to visualize the results obtained by the pantograph detection subsystem and the wagon identifier subsystem.
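The benefit of tiling can be quantified with a small sketch that enumerates the pyramid levels and the tiles needed for a given viewport. It assumes that each level halves the previous one until the image fits in a single tile, and it uses a 1920x1080 viewport as an example; both are assumptions, since the text only states that downscaled versions are cut into 256x256 tiles.

```python
import math

TILE = 256  # tile side in pixels, as stated in the text


def pyramid_levels(width, height, tile=TILE):
    """List (level, width, height, cols, rows) for each pyramid level,
    halving the image until it fits in a single tile (assumed scheme)."""
    levels = []
    level = 0
    while True:
        cols, rows = math.ceil(width / tile), math.ceil(height / tile)
        levels.append((level, width, height, cols, rows))
        if cols == 1 and rows == 1:
            return levels
        width = max(1, math.ceil(width / 2))
        height = max(1, math.ceil(height / 2))
        level += 1


def visible_tiles(x, y, view_w, view_h, cols, rows, tile=TILE):
    """(col, row) indices of the tiles intersecting a viewport whose
    top-left corner is (x, y) in the pixel coordinates of that level."""
    c0, r0 = max(0, x // tile), max(0, y // tile)
    c1 = min(cols - 1, (x + view_w - 1) // tile)
    r1 = min(rows - 1, (y + view_h - 1) // tile)
    return [(c, r) for r in range(r0, r1 + 1) for c in range(c0, c1 + 1)]


# A mosaic of 80000 lines of 4096 pixels, the average size quoted earlier.
levels = pyramid_levels(80000, 4096)
_, _, _, cols0, rows0 = levels[0]
needed = visible_tiles(1000, 500, 1920, 1080, cols0, rows0)
print(f"{len(levels)} levels, {cols0 * rows0} tiles at full resolution, "
      f"{len(needed)} needed for the viewport")
```

At full resolution only a few dozen of the several thousand tiles intersect the viewport, which is what keeps panning and zooming responsive.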

Fig. 6 shows an overview of the interface and some phases of the interaction of a control operator with the interactive GUI, such as checking the temperature of an area of the wagon or visualizing a high-definition image of the train. The set of functionalities provided by the GUI allows the operator to have a quick overview of the image processing results and to perform precise, localized checks through direct manipulation using multitouch gestures.


Fig. 7 Acquisition sample of a full HD video taken with a 
standard camera. 

5 System evaluation 

To evaluate our proposed system we recorded a dataset of sequences in Poland (Zmigrod) using the portal depicted in Fig. 1. We acquired 36 sequences of a train composed of one locomotive and one wagon on a test-bed railway track of 1 km. The train passed under the multi-camera gantry at different times of the day, at different speeds and in different weather conditions. To record these sequences we used the system architecture and the web interface previously described in section 3. In this section we evaluate each of the sub-systems of our proposed approach.

5.1 Wagon identification 

Fig. 8 Example of a false positive when two character
regions (here 0 and 3) are merged.

As shown in Fig. 7, it would be very difficult to segment
the identifier with a standard camera: strong motion blur
due to the train speed affects the readability of several
characters. A high-rate linear camera is hence necessary
to properly segment the wagon identifier. To evaluate the
wagon identifier segmentation performance, we estimated
the accuracy of both the full train ID segmentation and
the single-character segmentation for each wagon. For
character segmentation, we count as a true positive every
detected region that contains a character of the wagon ID,
as a false negative every missed character of the wagon
ID, and as a false positive every region classified as
part of the ID that does not contain a character of the
wagon ID. For full ID segmentation, we count a true
positive every time all the characters of the wagon ID
are recognized, a false negative every time at least one
character of the wagon ID is missed, and a false positive
for every region classified as positive that does not
contain a character of the wagon ID. For the full ID
segmentation evaluation there is exactly one target
detection per wagon, which makes a higher false positive
rate more likely, since any region that does not contain
the ID is counted as a false positive.
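The counting rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `Rates`, `segmentation_rates` and `full_id_counts` are hypothetical, the input data are made up, and the text does not state which denominators are used, so the sketch assumes that all three rates are normalized by the number of ground-truth targets (true positives plus false negatives), which makes accuracy and false negative rate sum to 100%.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Rates:
    accuracy: float  # percentage of targets correctly segmented
    fn_rate: float   # percentage of targets missed
    fp_rate: float   # spurious regions per 100 targets


def segmentation_rates(tp: int, fn: int, fp: int) -> Rates:
    """Turn raw counts into percentages.

    Assumption: every rate is normalized by the number of ground-truth
    targets (tp + fn); the paper does not spell out the denominators.
    """
    targets = tp + fn
    if targets == 0:
        raise ValueError("at least one ground-truth target is required")
    return Rates(100.0 * tp / targets,
                 100.0 * fn / targets,
                 100.0 * fp / targets)


def full_id_counts(passes: List[Tuple[int, int, int]]) -> Tuple[int, int, int]:
    """Count TP/FN/FP for full ID segmentation.

    Each pass is (characters_found, id_length, spurious_regions):
    a pass is a true positive only if every character was found,
    otherwise it is a false negative; every spurious region is a
    false positive.
    """
    tp = sum(1 for found, length, _ in passes if found == length)
    fn = len(passes) - tp
    fp = sum(spurious for _, _, spurious in passes)
    return tp, fn, fp


# Hypothetical data: four passes of a wagon with a 12-character ID.
passes = [(12, 12, 0), (12, 12, 1), (11, 12, 0), (12, 12, 0)]
tp, fn, fp = full_id_counts(passes)
rates = segmentation_rates(tp, fn, fp)
print(tp, fn, fp)  # 3 1 1
print(rates)       # Rates(accuracy=75.0, fn_rate=25.0, fp_rate=25.0)
```

Because a single missed character turns a whole pass into a false negative, the full ID figures are expected to be lower than the per-character ones for the same detections.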



           Accuracy (%)   FN Rate (%)   FP Rate (%)
Wagon 1        88.2          11.8           5.9
Wagon 2       100.0           0            14.71
Average        94.1           5.9          10.3

Table 2 Full ID segmentation accuracy.



           Accuracy (%)   FN Rate (%)   FP Rate (%)
Wagon 1        93.4           6.6           0.7
Wagon 2       100.0           0             1.2
Average        96.7           3.3           1.0

Table 3 ID characters segmentation accuracy.





Fig. 10 Examples of pantograph matching under different
illumination conditions. The pantograph template is on the
upper left side of each image.


As can be observed from Table 2, the accuracy of full ID
segmentation is high for wagon 1 (88.2%), while for
wagon 2 the full train ID is always detected. Both false
positives and false negatives are limited for the full
train ID segmentation of each wagon. When evaluating in
terms of character segmentation (see Table 3), the
results are even better, with low false negative and
false positive rates. False positives are mainly caused
by small characters in the train ID being merged
together, so that a region close to the train ID is
considered part of it, as shown in Fig. 8.

In Fig. 9 we show a qualitative sample of how the
proposed solution segments the train identifier, for both
the locomotive and the wagon. Although the train
identifier occupies a very small part of the initial
image, our method successfully detects it.


5.2 Pantograph detection 

The pantograph was observed only in the afternoon test
session, that is, in 18 (out of 34) sequences of the
dataset. In each of these 18 sequences the pantograph is
correctly detected by the proposed solution.

In Fig. 10 we report some samples of pantograph matching
under very different illumination conditions.


5.3 Evaluation of the user interface 

Cognitive Walkthrough (CW) [25] is a usability inspection
method whose objective is to identify usability problems,
focusing on how easy it is for new users to accomplish
predefined tasks. It aims at detecting design errors that
would interfere with the performance of users while using
the interface. CW is usually carried out by specialists
in interface development and by usability experts. As the
walkthrough proceeds, the comments of the users are
recorded.

Fig. 9 Examples of text region segmentation on the two
different wagons. The bounding box segmented after RANSAC
refinement is shown in green.

We conducted a usability inspection of the proposed
touch-based interface to assess its possible usability
issues. For this purpose we defined the following tasks:

T1 Load the most recently analysed sequence
T2 Position the video on the sequence corresponding
to the pantograph detection
T3 Visualize the areas of the wagon whose temperature is
higher than 50°
T4 Visualize the wagon ID number using the analysis
results

For each task we defined a sequence of actions detailing
the specific task flow from beginning to end. We asked
5 examiners to perform the defined tasks using the
so-called think-aloud technique, in order to record
failures in the interaction and design suggestions.
Previous studies on usability testing [17] showed that
the number of usability problems $U_p$ found in a
usability test is:

$$U_p = N \left(1 - (1 - L)^{n}\right) \qquad (16)$$

where $n$ is the number of users, $N$ is the total number
of usability problems in the design and $L$ is the
proportion of usability problems discovered while testing
a single user. The typical value of $L$ is 31%,
suggesting that roughly 85% of usability issues can be
found with 5 testers.

All the examiners were able to complete the assigned
tasks, reporting usability and user interface problems
while performing the evaluation. Feedback from the
examiners allowed us to identify and correct minor but
important usability issues, mostly regarding the sizes
and positions of objects on the screen or ambiguities in
the use of textual labels.

6 Conclusions

In this paper we introduced a multi-camera portal for
train safety assessment. Our proposal is able to analyze
multiple safety requirements of each train passing under
the gantry without requiring the train to be stopped. We
detailed the hardware used and the software developed to
robustly acquire data from multiple high-rate sensors.
Image processing and computer vision methods are applied
to each data stream to extract meaningful information. We
also presented our multitouch interface, which enables an
operator to quickly observe and simply interact with the
processed data. The evaluation has shown the good
performance of the analysis and the usability of the
interface.
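
As a numerical check of the tester-count estimate used in Section 5.3, the problem-discovery model U_p = N(1 - (1 - L)^n) can be evaluated directly. The short Python sketch below is illustrative only: the function name `fraction_found` is hypothetical, and the default L = 0.31 is the typical value quoted in the text, not a value measured on our interface.

```python
def fraction_found(n_users: int, single_user_rate: float = 0.31) -> float:
    """Expected fraction U_p / N of usability problems found by n_users.

    single_user_rate is L, the proportion of problems a single
    tester is expected to uncover (0.31 is the typical value).
    """
    if n_users < 0:
        raise ValueError("n_users must be non-negative")
    if not 0.0 <= single_user_rate <= 1.0:
        raise ValueError("single_user_rate must lie in [0, 1]")
    return 1.0 - (1.0 - single_user_rate) ** n_users


# Expected coverage for 1 to 6 testers with the typical L = 0.31.
for n in range(1, 7):
    print(n, round(100.0 * fraction_found(n), 1))
```

With five testers the model gives about 84.4%, which is the figure usually rounded to 85%; each additional tester beyond that contributes progressively less.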